
Scraping Pros Python SDK

Extract data from any website at scale — without managing browsers, proxies, retries, or rate limits. Submit a list of URLs, stream results as they finish, and let the server handle the hard parts.

Install

Requires Python 3.10+.

pip install scrapingpros

No signup needed

Use the demo token demo_6x595maoA6GdOdVb for testing — 5,000 credits/month, 30 req/min. Swap it for your own token when you're ready to ship.

The one example you need

This is the pattern for 90% of production use cases. Submit many URLs, stream results back as each one completes, handle failures individually:

from scrapingpros import ScrapingPros

client = ScrapingPros("demo_6x595maoA6GdOdVb")

batch = client.submit_batch("daily-scrape", [
    {"url": "https://example.com/1", "custom_id": "tour-1", "browser": True},
    {"url": "https://example.com/2", "custom_id": "tour-2", "browser": True},
    {"url": "https://example.com/3", "custom_id": "tour-3", "browser": True},
    # ... works with hundreds or tens of thousands of URLs
])

for result in batch.iter_results():
    if result.guidance.success:
        save_to_db(my_id=result.custom_id, content=result.content)
    else:
        log_failure(my_id=result.custom_id, reason=result.guidance.error_type)

    # Progress is always available on the batch object
    if batch.completed_count % 100 == 0:
        print(f"{batch.pct:.1f}% — ETA {batch.eta_seconds:.0f}s")

print(f"Done: {batch.success_count} OK, {batch.failed_count} failed")
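The loop above deals with each failure as it arrives. If you would rather stream the successes and review the failures in one place once the stream ends, a minimal sketch could look like this. It reads only the attributes used in the example (guidance.success, guidance.error_type, custom_id). The helper name collect_failures, the SimpleNamespace stand-ins, and the "timeout" error type are illustrative assumptions, not part of the SDK.

```python
from types import SimpleNamespace


def collect_failures(results, on_success):
    """Stream successes to a callback and group failed custom_ids by reason.

    Hypothetical helper, not part of the SDK. `results` is any iterable of
    objects exposing .custom_id and .guidance.success / .guidance.error_type.
    """
    failed_by_reason = {}  # error_type -> [custom_id, ...]
    for result in results:
        if result.guidance.success:
            on_success(result)  # handled right away, nothing is buffered
        else:
            reason = result.guidance.error_type
            failed_by_reason.setdefault(reason, []).append(result.custom_id)
    return failed_by_reason


def _fake(custom_id, success, error_type=None):
    # Stand-in result object so the sketch runs without network access.
    return SimpleNamespace(
        custom_id=custom_id,
        content="<html></html>" if success else None,
        guidance=SimpleNamespace(success=success, error_type=error_type),
    )


sample = [
    _fake("tour-1", True),
    _fake("tour-2", False, error_type="timeout"),  # made-up error type
    _fake("tour-3", False, error_type="timeout"),
]
saved = []
failed = collect_failures(sample, saved.append)
```

With a real batch you would pass batch.iter_results() as the first argument. Only failed IDs are kept in memory, so the flat-memory property of streaming is preserved for the successful results.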

What this handles for you:

  • Scaling — submit 10 URLs or 50,000. The server manages workers. Your code stays the same.
  • Anti-bot — CAPTCHA detection, proxy rotation, IP rotation, browser fingerprinting. Automatic.
  • Streaming — results yield as soon as each job completes, not after everything is done. Memory stays flat.
  • Progress tracking — batch.pct, batch.eta_seconds, batch.success_count, batch.failed_count update live.
  • Failure visibility — every yielded result tells you if it worked and why it didn't. No silent drops.
  • Resume on crash — persist batch.collection_id + batch.run_id, reattach later with client.get_batch().
  • Rate-limit math — the SDK polls the server efficiently (1 API call per tick, regardless of batch size).
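For the resume-on-crash item above, the two identifiers have to survive a process restart. One way to persist them is a small JSON checkpoint file, sketched below. The helper names and the file layout are assumptions for illustration, and the exact arguments client.get_batch() expects are not shown here, so check the Batch API docs before wiring the loaded values into it.

```python
import json
import os
import tempfile


def save_checkpoint(path, collection_id, run_id):
    """Write the two batch identifiers to a JSON file (hypothetical helper).

    Writes to a temp file first and renames it, so a crash mid-write
    cannot leave a half-written checkpoint behind.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump({"collection_id": collection_id, "run_id": run_id}, fh)
    os.replace(tmp_path, path)  # atomic when source and target share a filesystem


def load_checkpoint(path):
    """Return (collection_id, run_id), or None if no checkpoint exists."""
    try:
        with open(path, encoding="utf-8") as fh:
            data = json.load(fh)
    except FileNotFoundError:
        return None
    return data["collection_id"], data["run_id"]
```

A typical flow would call save_checkpoint("batch.json", batch.collection_id, batch.run_id) right after submit_batch(), and on startup call load_checkpoint("batch.json") to decide whether to reattach or submit a fresh batch.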

Full Batch API docs

Need just one URL?

Sometimes you need a single scrape for a webhook handler or a quick check:

result = client.scrape("https://example.com", format="markdown", browser=True)
print(result.content)

Single scrape docs — use this for one-off requests, debugging, or anything where you need the answer synchronously.

Async / asyncio?

Every method has an async twin. Same surface, same endpoints — only the local I/O loop changes:

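import asyncio
from scrapingpros import AsyncClient   # alias for AsyncScrapingPros (since v0.5.1)

async def main(items):
    # `async with` is only valid inside a coroutine, so the calls live in main()
    async with AsyncClient("your_token") as client:
        batch = await client.submit_batch("daily", items)
        async for result in batch.iter_results():
            await save_async(result)   # save_async: your own coroutine

asyncio.run(main(items))   # items: the same list of dicts as in the sync example

Quick note on naming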

AsyncScrapingPros (and its alias AsyncClient) does not make scraping faster on its own — it only changes the I/O loop. To scale to many URLs, the right tool is submit_batch() on either client.
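What the async client does give you is the ability to overlap your own I/O with the result stream. A loop that awaits each save before pulling the next result still saves serially. The sketch below, using only the standard library, runs a bounded number of saves concurrently while the stream keeps yielding. fake_iter_results is a stand-in for batch.iter_results(), and consume and max_in_flight are hypothetical names, not SDK features.

```python
import asyncio
from types import SimpleNamespace


async def fake_iter_results(n):
    """Stand-in for batch.iter_results(): yields n fake results."""
    for i in range(n):
        await asyncio.sleep(0)  # pretend to wait on the network
        yield SimpleNamespace(custom_id=f"item-{i}", content=f"body {i}")


async def consume(stream, save, max_in_flight=10):
    """Run save(result) for every result, with at most max_in_flight saves pending."""
    sem = asyncio.Semaphore(max_in_flight)
    tasks = []

    async def guarded(result):
        try:
            await save(result)
        finally:
            sem.release()  # free a slot even if the save raised

    async for result in stream:
        await sem.acquire()  # back-pressure: pause the stream when all slots are busy
        tasks.append(asyncio.create_task(guarded(result)))

    await asyncio.gather(*tasks)  # wait for the tail; re-raises the first save error
    return len(tasks)
```

With the real client this would be called as await consume(batch.iter_results(), save_async) inside the async with block. The semaphore keeps a slow database from being flooded when results arrive faster than they can be written.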

Want a list back, not a stream?

If you'd rather block until every URL is done and get a flat list[ScrapeResponse] (drop-in replacement for scrape_many), use batch_scrape:

results = client.batch_scrape([
    {"url": u, "custom_id": pid, "browser": True}
    for pid, u in catalog.items()
])
for r in results:
    if r.guidance.success:
        save(r.custom_id, r.content)

Same server-side scaling as submit_batch, simpler signature. Available on both SyncClient and AsyncClient since v0.5.1.
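Because batch_scrape returns a flat list, it is easy to reconcile the results against the catalog you started from and see which entries still need attention. The following is a rough sketch under the assumption that the catalog is a dict of custom_id to URL, as in the snippet above. build_items and missing_ids are hypothetical helpers, and the SimpleNamespace objects stand in for real responses.

```python
from types import SimpleNamespace


def build_items(catalog, browser=True):
    """Turn {custom_id: url} into the list-of-dicts shape used above."""
    return [
        {"url": url, "custom_id": custom_id, "browser": browser}
        for custom_id, url in catalog.items()
    ]


def missing_ids(catalog, results):
    """custom_ids from the catalog with no successful result, in catalog order."""
    succeeded = {r.custom_id for r in results if r.guidance.success}
    return [custom_id for custom_id in catalog if custom_id not in succeeded]


# Stand-in data so the sketch runs without network access.
catalog = {
    "p1": "https://example.com/1",
    "p2": "https://example.com/2",
    "p3": "https://example.com/3",
}
fake_results = [
    SimpleNamespace(custom_id="p1", guidance=SimpleNamespace(success=True)),
    SimpleNamespace(custom_id="p2", guidance=SimpleNamespace(success=False)),
]
items = build_items(catalog)
to_retry = missing_ids(catalog, fake_results)
```

Here to_retry contains both the entry that failed and the entry that has no result at all, which is usually the set you want to resubmit.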

Authentication

# Explicit token
client = ScrapingPros("your_token")

# From environment variable
# export SP_TOKEN=your_token
client = ScrapingPros()
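How the SDK reacts when neither a token nor SP_TOKEN is present is not described here. If you want your service to fail fast at startup with a message of your own, one option is to resolve the token yourself and pass it in explicitly. This is a sketch of such a wrapper, not the SDK's internal logic; resolve_token and its precedence rule (explicit value first, then the environment) are assumptions.

```python
import os


def resolve_token(explicit=None, env=os.environ):
    """Return the token to use, preferring an explicit value over SP_TOKEN.

    Hypothetical helper: raises immediately if no token can be found,
    so a misconfigured deployment fails at startup rather than mid-run.
    """
    token = explicit or env.get("SP_TOKEN")
    if not token:
        raise RuntimeError("No token found: pass one explicitly or set SP_TOKEN")
    return token
```

Usage would then be client = ScrapingPros(resolve_token()). The env parameter exists only so the function can be tested without touching the real environment.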

Credits

Type                  Credits
Simple HTTP scrape    1
Browser scrape        5

  • Refunded automatically on infrastructure errors.
  • Track usage: client.credits_charged, client.quota_remaining.
  • client.billing() for monthly summaries, client.plans() for pricing.
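The per-request costs above make it straightforward to estimate a batch before submitting it. A minimal sketch of that arithmetic follows. It assumes an item without a "browser" key is billed as a simple HTTP scrape, and it ignores refunds, so treat the result as an upper bound. The function names are illustrative.

```python
CREDITS_HTTP = 1      # simple HTTP scrape, per the Credits section
CREDITS_BROWSER = 5   # browser scrape, per the Credits section


def estimate_credits(items):
    """Upper-bound credit cost for a list of item dicts.

    Assumption: items without a truthy "browser" key cost CREDITS_HTTP.
    Automatic refunds on infrastructure errors are not modelled.
    """
    return sum(
        CREDITS_BROWSER if item.get("browser") else CREDITS_HTTP
        for item in items
    )


def max_browser_scrapes(credit_budget):
    """How many browser scrapes fit into a given credit budget."""
    return credit_budget // CREDITS_BROWSER


example_items = [
    {"url": "https://example.com/1", "browser": True},
    {"url": "https://example.com/2", "browser": True},
    {"url": "https://example.com/3"},
]
cost = estimate_credits(example_items)      # 5 + 5 + 1 = 11
demo_capacity = max_browser_scrapes(5000)   # 5000 // 5 = 1000
```

For the demo token's 5,000 credits per month, that works out to at most 1,000 browser scrapes or 5,000 simple HTTP scrapes.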

Where next